Module 3 - Data Tidying

Learning Objectives

  • Converting between wide and long data formats with tidyr
  • Changing the shape of your data with dplyr
  • Importing data into R

Readings

Additional Resources:

This vignette has an even more in-depth discussion than the textbook.

The original Tidy Data paper is one of the most influential papers on data processing. It relates Tidy Data to database normalization, which might be familiar to you if you are an expert in SQL. The accompanying vignette has the code used to generate the results in the paper. If you do read the paper, note that its definition of tidy data differs from the one in your textbook. The original definition is close to a relational database style of organization, while the current definition is more of a flat-file approach that resembles a data matrix and is more readily analyzed.

This is a classic data visualization tool that we will use in this week’s homework.

A useful package that digs a little deeper into the trade-offs between different data organization schemes. We don’t use it this week, but you could use it to solve parts of the last problem more efficiently.

Videos

There are some excellent optional video lectures that discuss tidy data and data wrangling: